Code Walkthrough

Notebook Structure Explained

Understanding the imports, helper functions, data loading, and training loop that orchestrate the gradient descent algorithm

1

Import Statements

These lines import the essential Python libraries for numerical computing, data handling, and visualization.

imports.py
1  import matplotlib.pyplot as plt
2  import numpy as np
3  import pandas as pd
Line 1 import matplotlib.pyplot as plt

📊 Matplotlib - Visualization Library

Imports the pyplot module from Matplotlib for creating plots.

  • matplotlib — A widely used Python plotting library
  • pyplot — Provides a MATLAB-like interface for simple plotting
  • as plt — Creates a short alias so we can write plt.plot()

Used for: Plotting data points, decision boundaries, and error curves

Line 2 import numpy as np

🔢 NumPy - Numerical Computing

Imports NumPy, the fundamental package for scientific computing in Python.

  • Provides multi-dimensional arrays (faster than Python lists)
  • Mathematical functions: np.exp(), np.log(), np.dot()
  • Random number generation: np.random
  • Array operations are vectorized (operate on entire arrays at once)

NumPy underpins much of the Python machine learning ecosystem: libraries such as pandas and scikit-learn are built on its arrays.
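
To make "vectorized" concrete, here is a minimal sketch comparing an array operation with the equivalent Python loop. The arrays scores and weights are made-up values for illustration and are not part of the notebook.

```python
import numpy as np

# Made-up illustration values (not from the notebook).
scores = np.array([0.2, 0.5, 0.9])
weights = np.array([1.0, -2.0, 0.5])

# Element-wise operations apply to every entry without an explicit loop.
scaled = scores * 2

# np.dot computes the sum of element-wise products in one call.
vectorized = np.dot(scores, weights)

# The equivalent pure-Python loop, for comparison.
looped = sum(s * w for s, w in zip(scores, weights))
```

Both approaches give the same number; the vectorized form is shorter and avoids per-element Python overhead.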

Line 3 import pandas as pd

📋 Pandas - Data Analysis

Imports Pandas for reading and manipulating tabular data.

  • Provides DataFrames — like spreadsheets in Python
  • Easy CSV file reading with pd.read_csv()
  • Powerful data selection and filtering

Used for: Loading the training data from data.csv

2

Visualization Helpers

These provided functions handle plotting data points and decision boundaries. You don't need to modify them.

plot_points.py
1  def plot_points(X, y):
2      admitted = X[np.argwhere(y==1)]
3      rejected = X[np.argwhere(y==0)]
4      plt.scatter([s[0][0] for s in rejected], ..., color='blue')
5      plt.scatter([s[0][0] for s in admitted], ..., color='red')
Lines 1-5 plot_points(X, y)

🔴🔵 Plot Data Points by Class

This function separates data into two classes and plots them with different colors.

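  • np.argwhere(y==1) — Finds the indices where the label is 1, returned as a column of shape (k, 1)
  • X[...] — Selects the rows of X at those indices; because the index array is 2-D, the result has shape (k, 1, 2), which is why the code reads s[0][0]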
  • Blue dots = Class 0 (rejected)
  • Red dots = Class 1 (admitted)
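
The indexing in plot_points can be checked on a tiny array. The sketch below uses three made-up samples and also shows a boolean mask, which is an alternative (not what the provided helper does) that selects the same rows without the extra axis.

```python
import numpy as np

# Three made-up samples with labels (not from data.csv).
X = np.array([[0.1, 0.2], [0.8, 0.9], [0.4, 0.3]])
y = np.array([0, 1, 0])

idx = np.argwhere(y == 1)      # shape (1, 1): one matching index, wrapped in its own row
admitted = X[idx]              # shape (1, 1, 2): extra axis from the 2-D index array
first_x1 = admitted[0][0][0]   # for s in admitted, s[0][0] is x1 of that sample

# Alternative: a boolean mask selects the same rows without the extra axis.
admitted_flat = X[y == 1]      # shape (1, 2)
```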
display.py
1  def display(m, b, color='g--'):
2      plt.xlim(-0.05, 1.05)
3      plt.ylim(-0.05, 1.05)
4      x = np.arange(-10, 10, 0.1)
5      plt.plot(x, m*x+b, color)
Lines 1-5 display(m, b, color='g--')

📈 Plot Decision Boundary Line

Draws the decision boundary using slope-intercept form: y = mx + b

  • m — Slope of the line
  • b — Y-intercept
  • 'g--' — Green dashed line (default)
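  • xlim/ylim — Sets the visible range on both axes to [-0.05, 1.05], that is, [0, 1] with a small margin; the line is computed for x from -10 to 10 and clipped to this window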
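
The walkthrough does not show how m and b are obtained. One common way, assuming the boundary is the set of points where w1*x1 + w2*x2 + bias = 0, is to solve that equation for x2. The helper boundary_line below is hypothetical and only illustrates that algebra.

```python
import numpy as np

def boundary_line(weights, bias):
    """Hypothetical helper: return (m, b) for w1*x1 + w2*x2 + bias = 0, solved for x2."""
    w1, w2 = weights
    if w2 == 0:
        raise ValueError("vertical boundary: cannot be written as x2 = m*x1 + b")
    return -w1 / w2, -bias / w2

# Made-up weights and bias for illustration.
m, b = boundary_line(np.array([2.0, 4.0]), -2.0)
```

The resulting pair could then be passed to display(m, b).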
3

Loading the Data

These lines read the training data from a CSV file and prepare it for the algorithm.

load_data.py
1  data = pd.read_csv('data.csv', header=None)
2  X = np.array(data[[0,1]])
3  y = np.array(data[2])
4  plot_points(X, y)
5  plt.show()
Line 1 data = pd.read_csv('data.csv', header=None)

📁 Read CSV File

Loads the training data from a CSV file into a Pandas DataFrame.

  • 'data.csv' — Filename containing our training examples
  • header=None — File has no header row, just data
  • Result: DataFrame with columns accessed by index (0, 1, 2)
CSV Structure
Column 0: Feature x₁ (e.g., test score 1)
Column 1: Feature x₂ (e.g., test score 2)
Column 2: Label y (0 = rejected, 1 = admitted)
Line 2 X = np.array(data[[0,1]])

📊 Extract Feature Matrix

Selects the input features and converts to a NumPy array.

  • data[[0,1]] — Selects columns 0 and 1
  • np.array() — Converts to NumPy array for fast math
  • Result shape: (n_samples, 2) — each row is [x₁, x₂]

Convention: Capital X for the feature matrix, lowercase x for a single sample.

Line 3 y = np.array(data[2])

🏷️ Extract Labels

Gets the target labels (ground truth) as a NumPy array.

  • data[2] — Selects column 2 (labels)
  • Result shape: (n_samples,) — 1D array of 0s and 1s
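
The loading steps can be tried without the real file. This sketch substitutes an in-memory string for data.csv; the three rows are invented for illustration and only mimic the column layout described above.

```python
import io

import numpy as np
import pandas as pd

# Stand-in for data.csv: three made-up rows, no header line.
csv_text = "0.78,0.11,1\n0.35,0.62,0\n0.51,0.48,1\n"

data = pd.read_csv(io.StringIO(csv_text), header=None)
X = np.array(data[[0, 1]])   # feature matrix, shape (3, 2)
y = np.array(data[2])        # label vector, shape (3,)
```

With header=None, pandas names the columns 0, 1, 2, which is why integer keys work in data[[0, 1]] and data[2].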
Lines 4-5 plot_points(X, y) / plt.show()

👁️ Visualize the Data

Plots the data to see what we're working with before training.

4

Hyperparameters

These settings control how the gradient descent algorithm behaves during training.

config.py
1  np.random.seed(44)
2
3  epochs = 100
4  learnrate = 0.01
Line 1 np.random.seed(44)

🎲 Set Random Seed

Makes random number generation reproducible.

  • Weights are initialized randomly
  • Setting a seed ensures the same "random" numbers on each run
  • Allows you to reproduce results exactly
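
A quick way to see the effect of the seed is to draw numbers twice, resetting the seed in between. This is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

np.random.seed(44)
first = np.random.normal(size=3)

np.random.seed(44)                 # resetting the seed restarts the same sequence
second = np.random.normal(size=3)

third = np.random.normal(size=3)   # no reset: the sequence simply continues
```

Here first and second are identical, while third differs because it continues the sequence instead of restarting it.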
Line 3 epochs = 100

🔄 Number of Epochs

How many times to iterate through the entire training set.

  • 1 epoch = 1 complete pass through all training samples
  • More epochs → more weight updates; the loss typically levels off after enough passes
Line 4 learnrate = 0.01

📏 Learning Rate (α)

Controls the step size for each weight update.

  • α = 0.01 is a common starting point
  • Too high → Overshoots, diverges
  • Too low → Very slow convergence

The learning rate is crucial! Experiment with different values to see how it affects training.
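
The effect of the step size can be seen on a toy problem. The sketch below is not the notebook's model: it minimizes f(w) = w**2 (gradient 2*w) with a hypothetical helper descend, using three illustrative learning rates.

```python
def descend(learnrate, steps=20, w=1.0):
    """Run gradient descent on f(w) = w**2, whose gradient is 2*w."""
    for _ in range(steps):
        w = w - learnrate * 2 * w
    return w

too_high = descend(1.1)    # each step multiplies w by -1.2, so |w| grows
too_low = descend(0.001)   # each step multiplies w by 0.998, so w barely moves
moderate = descend(0.1)    # each step multiplies w by 0.8, so w shrinks toward 0
```

The right value depends on the problem, so the thresholds seen here do not carry over to the notebook's data.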

5

Training Loop

The main function that orchestrates gradient descent, calling your four implemented functions (update_weights(), output_formula(), and error_formula() are called directly in the excerpt below).

train.py
 1  def train(features, targets, epochs, learnrate, graph_lines=False):
 2      errors = []
 3      n_records, n_features = features.shape
 4      last_loss = None
 5      weights = np.random.normal(scale=1/n_features**.5, size=n_features)
 6      bias = 0
 7      for e in range(epochs):
 8          for x, y in zip(features, targets):
 9              weights, bias = update_weights(x, y, weights, bias, learnrate)
10          out = output_formula(features, weights, bias)
11          loss = np.mean(error_formula(targets, out))
12          errors.append(loss)
Line 5 weights = np.random.normal(scale=1/n_features**.5, size=n_features)

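⚖️ Initialize Weights (Xavier-Style Initialization)

Creates random initial weights from a normal distribution.

  • np.random.normal() — Draws from a Gaussian distribution with mean 0
  • scale=1/√n_features — Standard deviation shrinks as the number of inputs grows, a fan-in scaling often described as Xavier-style initialization
  • size=n_features — One weight per input feature

Scaling by the number of inputs keeps the initial weighted sum, and therefore the initial output, in a moderate range. In deep networks this kind of scaling is used to reduce the risk of vanishing or exploding gradients.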
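
The initialization call can be run in isolation. This sketch reuses the call from train() with n_features = 2 (matching the two feature columns) and adds a large extra sample, which is not in the notebook, to check the spread.

```python
import numpy as np

np.random.seed(44)
n_features = 2

# Same call as in train(): standard deviation 1/sqrt(n_features).
weights = np.random.normal(scale=1 / n_features**0.5, size=n_features)

# Illustration only: with many draws, the sample standard deviation
# approaches the requested scale (about 0.707 for two features).
many = np.random.normal(scale=1 / n_features**0.5, size=100_000)
```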

Lines 7-9 for e in range(epochs): for x, y in zip(...): update_weights(...)

🔄 The Training Loops (Stochastic Gradient Descent)

Two nested loops that perform stochastic gradient descent:

  • Outer loop (Line 7): Iterates through epochs
  • Inner loop (Line 8): Iterates through each sample
  • zip(features, targets) — Pairs each input with its label
  • Line 9: update_weights() (your function) updates the weights and bias for each sample

This is Stochastic GD — weights update after EACH sample, not after seeing all samples.
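
The per-sample update pattern can be sketched as follows. The bodies of sigmoid and update_weights here are assumptions: they use a sigmoid output with the log-loss gradient, a common choice for this kind of binary classifier, and may differ from what the notebook expects you to write. The data is made up.

```python
import numpy as np

def sigmoid(z):
    """Assumed activation: squashes any real number into (0, 1)."""
    return 1 / (1 + np.exp(-z))

def update_weights(x, y, weights, bias, learnrate):
    """Assumed update: one gradient step on this sample's log-loss."""
    y_hat = sigmoid(np.dot(x, weights) + bias)
    error = y - y_hat
    return weights + learnrate * error * x, bias + learnrate * error

# Two made-up samples.
features = np.array([[0.0, 0.0], [1.0, 1.0]])
targets = np.array([0, 1])
weights, bias = np.zeros(2), 0.0

n_updates = 0
for x, y in zip(features, targets):   # one update per sample
    weights, bias = update_weights(x, y, weights, bias, learnrate=0.5)
    n_updates += 1
```

After one pass over two samples the weights have changed twice, which is what distinguishes this loop from a batch update that would change them once.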

Lines 10-12 out = output_formula(...) / loss = np.mean(error_formula(...))

📉 Calculate and Track Loss

After each epoch, compute the average loss to monitor progress.

  • output_formula() (your function): Gets predictions for all samples
  • error_formula() (your function): Computes the error for each sample
  • np.mean() — Averages errors across all samples
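
The loss-tracking step can be illustrated on its own. The bodies of output_formula and error_formula below are assumptions (sigmoid output and per-sample log-loss), not the notebook's reference solutions, and the data and weights are made up.

```python
import numpy as np

def output_formula(features, weights, bias):
    """Assumed prediction: sigmoid of the linear score for every row."""
    return 1 / (1 + np.exp(-(np.dot(features, weights) + bias)))

def error_formula(y, output):
    """Assumed error: per-sample log-loss (cross-entropy)."""
    return -y * np.log(output) - (1 - y) * np.log(1 - output)

# Two made-up samples.
features = np.array([[0.0, 0.0], [1.0, 1.0]])
targets = np.array([0, 1])

# Zero weights predict 0.5 everywhere, so the mean loss is ln(2).
untrained = np.mean(error_formula(targets, output_formula(features, np.zeros(2), 0.0)))

# Weights that separate the two samples give a lower mean loss.
better = np.mean(error_formula(targets, output_formula(features, np.array([2.0, 2.0]), -2.0)))
```

Appending one such mean per epoch to errors is what produces the error curve.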
6

Running the Training

run.py
1  train(X, y, epochs, learnrate, True)
Line 1 train(X, y, epochs, learnrate, True)

🚀 Start Training!

Calls the training function with all our prepared data and settings.

What you'll see:

  • Loss and accuracy printed every 10 epochs
  • Decision boundary lines evolving
  • Final boundary plot
  • Error curve (should decrease!)